Coordinating multi-agent reinforcement learning with limited communication
Authors
Abstract
Coordinated multi-agent reinforcement learning (MARL) provides a promising approach to scaling learning in large cooperative multi-agent systems. Distributed constraint optimization (DCOP) techniques have been used to coordinate action selection among agents during both the learning phase and the policy execution phase (if learning is offline) to ensure good overall system performance. However, running DCOP algorithms across the whole system for every action selection results in significant communication among agents, which is impractical for most applications with limited communication bandwidth. In this paper, we develop a learning approach that generalizes previous coordinated MARL approaches that use DCOP algorithms and enables MARL to be conducted over a spectrum from independent learning (without communication) to fully coordinated learning, depending on agents’ communication bandwidth. Our approach defines an interaction measure that allows agents to dynamically identify their beneficial coordination set (i.e., whom to coordinate with) in different situations and to trade off performance against communication cost. By limiting their coordination sets, agents dynamically decompose the coordination network in a distributed way, resulting in dramatically reduced communication for DCOP algorithms without significantly affecting overall learning performance. Essentially, our learning approach co-adapts agents’ policy learning and coordination set identification, which outperforms approaches that perform the two in sequence.
Similar resources
Voltage Coordination of FACTS Devices in Power Systems Using RL-Based Multi-Agent Systems
This paper describes how multi-agent system technology can be used as the underpinning platform for voltage control in power systems. In this study, some FACTS (flexible AC transmission systems) devices are properly designed to coordinate their decisions and actions in order to provide a coordinated secondary voltage control mechanism based on multi-agent theory. Each device here is modeled as ...
Learning with Whom to Communicate Using Relational Reinforcement Learning
Relational reinforcement learning is a promising direction within reinforcement learning research. It upgrades reinforcement learning techniques by using relational representations for states, actions, and learned value-functions or policies to allow natural representations and abstractions of complex tasks. Multiagent systems are characterized by their relational structure and present a good e...
Learning Complex Swarm Behaviors by Exploiting Local Communication Protocols with Deep Reinforcement Learning
Swarm systems constitute a challenging problem for reinforcement learning (RL) as the algorithm needs to learn decentralized control policies that can cope with limited local sensing and communication abilities of the agents. Although there have been recent advances of deep RL algorithms applied to multi-agent systems, learning communication protocols while simultaneously learning the behavior ...
Coordination in multiagent reinforcement learning systems by virtual reinforcement signals
This paper presents a novel method for on-line coordination in multiagent reinforcement learning systems. In this method a reinforcement-learning agent learns to select its action estimating system dynamics in terms of both the natural reward for task achievement and the virtual reward for cooperation. The virtual reward for cooperation is ascertained dynamically by a coordinating agent who est...
Indeterminacy Reduction in Agent Communication Using a Semantic Language
In recent years, the importance of vagueness and uncertainty in the messages exchanged between agents has been highlighted mainly due to the ubiquitous nature of the (artificial or human) agents’ communication. The imprecision in the communication becomes more significant when the autonomy of the agents increases or the number of exchanged messages for a communicative goal is limited. In this p...
Journal title:
Volume / issue:
Pages: -
Publication date: 2013